Conversation
- extend reduce_sum to accept any numpy dtype (float64, int64, bool, etc.) falling back to numpy with appropriate accumulator (float64 for floats, int64 for integers/bool) instead of raising ValueError - update test_unsupported_dtype_raises → split into mean/std test and a new test_reduce_sum_accepts_any_dtype covering float64, int64, bool Made-with: Cursor
Reviewer's GuideExtends reduce_sum to handle any NumPy dtype by routing non-uint8/float32 arrays through a generic NumPy-based reduction with appropriate accumulator dtypes, updates tests to reflect the new behavior, and bumps the package version to 0.1.5. Sequence diagram for extended reduce_sum dtype handlingsequenceDiagram
actor User
participant reduce_sum
participant _reduce_sum_global_uint8
participant _reduce_sum_global_float32
participant _reduce_sum_per_channel_uint8
participant _reduce_sum_per_channel_float32
participant _reduce_sum_numpy
participant numpy
User->>reduce_sum: reduce_sum(arr, axis, keepdims)
alt axis is None or global
alt _is_uint8_image(arr)
reduce_sum->>_reduce_sum_global_uint8: _reduce_sum_global_uint8(arr, keepdims)
_reduce_sum_global_uint8-->>reduce_sum: uint64 or ndarray
else _is_float32_image(arr)
reduce_sum->>_reduce_sum_global_float32: _reduce_sum_global_float32(arr, keepdims)
_reduce_sum_global_float32-->>reduce_sum: float32 or ndarray
else other dtype
reduce_sum->>_reduce_sum_numpy: _reduce_sum_numpy(arr, axes=None, keepdims)
_reduce_sum_numpy->>numpy: sum(arr, axis=None, dtype=acc, keepdims)
numpy-->>_reduce_sum_numpy: result
_reduce_sum_numpy-->>reduce_sum: result
end
else axis equals _per_channel_spatial_axes(arr)
alt _is_uint8_image(arr)
reduce_sum->>_reduce_sum_per_channel_uint8: _reduce_sum_per_channel_uint8(arr, keepdims)
_reduce_sum_per_channel_uint8-->>reduce_sum: uint64 or ndarray
else _is_float32_image(arr)
reduce_sum->>_reduce_sum_per_channel_float32: _reduce_sum_per_channel_float32(arr, axes, keepdims)
_reduce_sum_per_channel_float32-->>reduce_sum: float32 or ndarray
else other dtype
reduce_sum->>_reduce_sum_numpy: _reduce_sum_numpy(arr, axes, keepdims)
_reduce_sum_numpy->>numpy: sum(arr, axis=axes, dtype=acc, keepdims)
numpy-->>_reduce_sum_numpy: result
_reduce_sum_numpy-->>reduce_sum: result
end
else explicit axes
reduce_sum->>_reduce_sum_numpy: _reduce_sum_numpy(arr, axes, keepdims)
_reduce_sum_numpy->>numpy: sum(arr, axis=axes, dtype=acc, keepdims)
numpy-->>_reduce_sum_numpy: result
_reduce_sum_numpy-->>reduce_sum: result
end
reduce_sum-->>User: reduced result
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
_reduce_sum_numpy, the return type annotation (np.uint64 | np.float64 | np.ndarray) no longer quite matches the behavior now that integer/bool paths can returnnp.int64scalars; consider tightening the hint (e.g., tonp.generic | np.ndarray) or explicitly includingnp.int64. - The accumulator selection in
_reduce_sum_numpytreats all non-uint8integer and bool dtypes asint64; if you expect larger unsigned types (e.g.,uint32/uint64), consider branching on signedness and itemsize to avoid surprising overflow or sign changes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_reduce_sum_numpy`, the return type annotation (`np.uint64 | np.float64 | np.ndarray`) no longer quite matches the behavior now that integer/bool paths can return `np.int64` scalars; consider tightening the hint (e.g., to `np.generic | np.ndarray`) or explicitly including `np.int64`.
- The accumulator selection in `_reduce_sum_numpy` treats all non-`uint8` integer and bool dtypes as `int64`; if you expect larger unsigned types (e.g., `uint32`/`uint64`), consider branching on signedness and itemsize to avoid surprising overflow or sign changes.
## Individual Comments
### Comment 1
<location path="tests/test_stats.py" line_range="364-371" />
<code_context>
- reduce_sum(arr, "per_channel")
+
+
+@pytest.mark.parametrize("dtype", [np.int32, np.int64, np.float64, np.bool_])
+def test_reduce_sum_accepts_any_dtype(dtype: type) -> None:
+ arr = np.ones((4, 4, 3), dtype=dtype)
+ result = reduce_sum(arr)
+ assert result == arr.size
+ per_ch = reduce_sum(arr, "per_channel")
+ assert per_ch.shape == (3,)
+ assert np.all(per_ch == 16)
</code_context>
<issue_to_address>
**suggestion (testing):** Extend `test_reduce_sum_accepts_any_dtype` to actually exercise the accumulator behavior (overflow/precision) rather than just trivial ones.
This test only checks that `reduce_sum` works for various dtypes and preserves shape/value semantics. It doesn’t verify the new accumulator behavior (`float64` for floats, `int64` for ints/bools, `uint64` for `uint8`). With all-ones inputs, you won’t see overflow or precision issues. Please add a parametrized test (or subtests here) with inputs near `int32`/`int64` limits and a float case sensitive to precision, then assert that the result and/or result dtype matches `np.sum(arr, dtype=<expected_acc>)`. That will lock in the chosen accumulator dtypes and catch regressions.
</issue_to_address>
### Comment 2
<location path="tests/test_stats.py" line_range="353-354" />
<code_context>
@pytest.mark.parametrize("dtype", [np.int32, np.float64])
-def test_unsupported_dtype_raises(dtype: type) -> None:
+def test_unsupported_dtype_raises_mean_std(dtype: type) -> None:
arr = np.ones((2, 2, 1), dtype=dtype)
with pytest.raises(ValueError, match="Unsupported dtype"):
</code_context>
<issue_to_address>
**suggestion (testing):** Broaden `test_unsupported_dtype_raises_mean_std` to cover additional unsupported dtypes (e.g. bool, complex) to better document mean/std constraints.
The new name is clearer, but the test still only covers `np.int32` and `np.float64`. Since `mean`/`std`/`mean_std` should reject all unsupported dtypes, please add at least one non-integer/non-float case (e.g. `np.bool_` and/or `np.complex64`) to assert that these also raise `ValueError`. This will better document the supported dtypes and guard against unintentionally accepting new dtypes without updating tests.
```suggestion
@pytest.mark.parametrize("dtype", [np.int32, np.float64, np.bool_, np.complex64])
def test_unsupported_dtype_raises_mean_std(dtype: type) -> None:
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| @pytest.mark.parametrize("dtype", [np.int32, np.float64]) | ||
| def test_unsupported_dtype_raises(dtype: type) -> None: |
There was a problem hiding this comment.
suggestion (testing): Broaden test_unsupported_dtype_raises_mean_std to cover additional unsupported dtypes (e.g. bool, complex) to better document mean/std constraints.
The new name is clearer, but the test still only covers np.int32 and np.float64. Since mean/std/mean_std should reject all unsupported dtypes, please add at least one non-integer/non-float case (e.g. np.bool_ and/or np.complex64) to assert that these also raise ValueError. This will better document the supported dtypes and guard against unintentionally accepting new dtypes without updating tests.
| @pytest.mark.parametrize("dtype", [np.int32, np.float64]) | |
| def test_unsupported_dtype_raises(dtype: type) -> None: | |
| @pytest.mark.parametrize("dtype", [np.int32, np.float64, np.bool_, np.complex64]) | |
| def test_unsupported_dtype_raises_mean_std(dtype: type) -> None: |
There was a problem hiding this comment.
Pull request overview
This PR bumps albucore to v0.1.5 and updates reduce_sum to support a broader set of NumPy dtypes by falling back to numpy.sum with dtype-dependent accumulator selection, alongside updated tests.
Changes:
- Bump package version from
0.1.4→0.1.5. - Extend
reduce_sumto accept non-uint8/float32arrays via NumPy fallback with accumulator selection. - Update/rename the unsupported-dtype test for mean/std and add a new test validating
reduce_sumbehavior on additional dtypes.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
albucore/stats.py |
Allows reduce_sum to fall back to NumPy for non-optimized dtypes and updates documentation. |
tests/test_stats.py |
Splits unsupported-dtype test coverage and adds coverage for broader reduce_sum dtype support. |
pyproject.toml |
Bumps project version to 0.1.5. |
uv.lock |
Updates locked package version to 0.1.5. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if arr.dtype == np.uint8: | ||
| acc: type = np.uint64 | ||
| elif np.issubdtype(arr.dtype, np.floating): | ||
| acc = np.float64 |
There was a problem hiding this comment.
_reduce_sum_numpy selects np.int64 as the accumulator for all non-uint8, non-floating dtypes. This includes unsigned integer arrays (e.g. uint16/uint32/uint64), where casting to int64 can produce incorrect negative results for values > 2**63-1 and is generally inconsistent with unsigned semantics. Consider using a np.uint64 accumulator for np.unsignedinteger dtypes (and keep np.int64 for signed ints/bool), so sums remain correct across all integer dtypes.
| acc = np.float64 | |
| acc = np.float64 | |
| elif np.issubdtype(arr.dtype, np.unsignedinteger): | |
| acc = np.uint64 |
| Args: | ||
| arr: ``uint8`` or ``float32`` array with explicit channel dimension. | ||
| arr: Array with explicit channel dimension. Optimised paths for ``uint8`` and ``float32``; | ||
| other dtypes fall back to NumPy (float → float64 accumulator, integer/bool → int64). | ||
| axis: ``None`` / ``"global"`` → one scalar; ``"per_channel"`` → shape ``(C,)``; |
There was a problem hiding this comment.
The docstring now states that reduce_sum accepts dtypes beyond uint8/float32 via a NumPy fallback, but the function signature still types arr as ImageType (uint8|float32). This mismatch will cause static type checkers to reject valid runtime usage (e.g. int64/bool). Consider widening arr’s type annotation (e.g. np.ndarray / NDArray[Any]) or adding a broader alias for stats inputs so the public API matches the documented behavior.
| def _reduce_sum_numpy( | ||
| arr: ImageType, | ||
| arr: np.ndarray, | ||
| axes: tuple[int, ...] | None, | ||
| *, | ||
| keepdims: bool, | ||
| ) -> np.uint64 | np.float64 | np.ndarray: | ||
| acc = np.uint64 if arr.dtype == np.uint8 else np.float64 | ||
| if arr.dtype == np.uint8: | ||
| acc: type = np.uint64 | ||
| elif np.issubdtype(arr.dtype, np.floating): | ||
| acc = np.float64 | ||
| else: | ||
| acc = np.int64 | ||
| return np.sum(arr, axis=axes, dtype=acc, keepdims=keepdims) |
There was a problem hiding this comment.
_reduce_sum_numpy can now return np.int64 (for integer/bool inputs when axes=None), but its return annotation is still np.uint64 | np.float64 | np.ndarray. Please update the return type to include the int64 case (or use a broader scalar type like np.integer | np.floating) so type hints match runtime behavior.
- _reduce_sum_numpy: branch on np.issubdtype(unsigned) → uint64 so uint32/uint64 inputs don't silently sign-extend into int64 - return type changed from uint64|float64|ndarray to generic|ndarray since int64 and uint64 scalars are now both valid outputs - test_unsupported_dtype_raises_mean_std: add bool_ and complex64 to the parametrize list per review suggestion - add test_reduce_sum_accumulator_dtype: locks in expected accumulator dtype for int32, int64, uint32, uint64, float64, bool_ - add test_reduce_sum_overflow_and_precision: exercises near-max int32, near-max uint32 (sign safety), and float64 precision cases Made-with: Cursor
The function accepts any dtype via the numpy fallback path; keeping ImageType (uint8 | float32) caused static type checkers to reject valid runtime calls with int64, bool_, etc. Made-with: Cursor
Made-with: Cursor
Summary by Sourcery
Allow reduce_sum to operate on a wider range of NumPy dtypes while keeping optimized paths for uint8 and float32, and bump the package version.
New Features:
Enhancements:
Build:
Tests: